A distributed packed storage for large dense parallel in-core calculations
Abstract
In this paper we propose a distributed packed storage format that exploits the symmetry or the triangular structure of a dense matrix. This format stores only half of the matrix while retaining most of the efficiency of full storage for a wide range of operations. This work was motivated by the fact that, unlike sequential linear algebra libraries (e.g. LAPACK [4]), the parallel distributed libraries currently available offer no routine or format that handles packed matrices. The proposed algorithms use only the existing ScaLAPACK [6] computational kernels, which demonstrates the generality of the approach and provides easy portability of the code and efficient reuse of existing software. The performance results obtained for the Cholesky factorization show that our packed format performs as well as or better than the ScaLAPACK full-storage algorithm for small numbers of processors. For larger numbers of processors, the ScaLAPACK full-storage routine performs slightly better until each processor runs out of memory.
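To make the idea of storing only half of a symmetric matrix concrete, the sketch below shows the sequential, LAPACK-style column-major packed layout of a lower triangle. It is an illustration only: the abstract does not describe the layout of the distributed format, so this is not the scheme proposed in the paper, and the helper names packed_index, pack_lower and get are hypothetical.

```python
def packed_index(i, j, n):
    """0-based position of A[i][j] in a column-major packed lower triangle.

    Assumes the sequential LAPACK-style packed layout (uplo = 'L');
    this is NOT the distributed format of the paper.
    """
    if i < j:
        i, j = j, i  # symmetry: A[i][j] == A[j][i]
    return i + j * (2 * n - j - 1) // 2


def pack_lower(a):
    """Copy the lower triangle of the n x n matrix a into a 1-D list."""
    n = len(a)
    ap = [0.0] * (n * (n + 1) // 2)  # n(n+1)/2 entries instead of n*n
    for j in range(n):
        for i in range(j, n):
            ap[packed_index(i, j, n)] = a[i][j]
    return ap


def get(ap, i, j, n):
    """Read any entry (i, j) of the symmetric matrix from packed storage."""
    return ap[packed_index(i, j, n)]


a = [[4.0, 1.0, 2.0],
     [1.0, 5.0, 3.0],
     [2.0, 3.0, 6.0]]
ap = pack_lower(a)
```

For an n x n matrix the packed array holds n(n+1)/2 entries rather than n^2, which approaches half of the full storage as n grows.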
Similar resources
An efficient distributed randomized algorithm for solving large dense symmetric indefinite linear systems
Randomized algorithms are gaining ground in high-performance computing applications as they have the potential to outperform deterministic methods, while still providing accurate results. We propose a randomized solver for distributed multicore architectures to efficiently solve large dense symmetric indefinite linear systems that are encountered, for instance, in parameter estimation problems ...
A Dense Out-of-Core Solver (DOCS) for Complex-Valued Linear Systems
Dense systems of linear equations are quite common in many science and engineering applications. Such linear systems place extreme storage and computational demands on computer resources and, in many cases, may severely limit the subsequent analysis. A dense out-of-core solver (DOCS) that operates on a partitioned coefficient matrix can reduce the in-core storage requirements of the linear syst...
Static Task Allocation in Distributed Systems Using Parallel Genetic Algorithm
Over the past two decades, PC speeds have increased from a few instructions per second to several million instructions per second. The tremendous speed of today's networks as well as the increasing need for high-performance systems has made researchers interested in parallel and distributed computing. The rapid growth of distributed systems has led to a variety of problems. Task allocation is a...
Department of Computer Science Technical Report CS-97-347 Packed storage extension for ScaLAPACK
We describe a new extension to ScaLAPACK [2] for computing with symmetric (Hermitian) matrices stored in a packed form. The new code is built upon the ScaLAPACK routines for full dense storage for a high degree of software reuse. The original ScaLAPACK stores a symmetric matrix as a full matrix but accesses only the lower or upper triangular part. The new code enables more efficient use of mem...
Department of Computer Science Technical Report CS-98-385 Packed storage extension for ScaLAPACK
We describe a new extension to ScaLAPACK [2] for computing with symmetric (Hermitian) matrices stored in a packed form. The new code is built upon the ScaLAPACK routines for full dense storage for a high degree of software reuse. The original ScaLAPACK stores a symmetric matrix as a full matrix but accesses only the lower or upper triangular part. The new code enables more efficient use of mem...
Journal title:
- Concurrency and Computation: Practice and Experience
Volume 19
Publication date: 2007